-
- This library contains scripts which can be used, in conjunction with the widely
- available public-domain disassembler 'ASMGEN.COM', to produce fairly intelligible
- source for the PC-DOS v3.0 system files IBMBIO.COM, IBMDOS.COM, & COMMAND.COM.
- Tactfully, one might state that their use is intended to 'augment and clarify'
- the technical documentation provided by IBM and Microsoft.
-
- The user is assumed to have access to and be familiar with v2.01 of ASMGEN; if
- so, s/he is probably also familiar with some of its limitations. Those that
- are pertinent here include :
-
- 1) The destination operand of LES instructions is incorrectly
- generated as a byte register instead of a word register. As per Intel
- CodeMacros, the following equivalences hold :
-
- Byte Word
- ---- ----
- AL AX
- CL CX
- DL DX
- BL BX
- AH SP
- CH BP
- DH SI
- BH DI
-
- Thus, read 'LES DI,[data]' for 'LES BH,[data]', etc. The LES
- instruction is an unfortunate exception to the general rule on
- the 8086, where even-numbered opcodes take byte operands.
-
- 2) The arguments to inter-segment direct CALLs and JMPs are not
- generated correctly by the disassembler. I can't think of any
- instance where either of these instructions occurs in any of the
- DOS 3.0 system files; if you encounter any, you'll have to use
- a debugger to provide the correct address.
-
- 3) Most significantly, ASMGEN's ability to deal with segmentation
- and relocation is somewhat limited: immediate word data in a segment
- other than the first in the file is generated inconsistently, although
- the program will often generate the 'real' operand as a comment. In
- some cases, this manifests itself in generated output such as :
-
- ...
- L1190 SEGMENT
- ASSUME ...
- ...
- L1234 CMP AX, OFFSET L1190 ;0000
- ...
- In other cases, the immediate operand is "clobbered" a bit more
- thoroughly, i.e., made negative and large. This limitation is
- encountered when disassembling IBMBIO and COMMAND, since both contain
- large sections of code which are executed at an offset other than that
- at which they are stored in the .COM file. I found the behavior of the
- suggested remedies (use of "d", "/l-", etc., in the .SEQ file) to
- be inconsistent with the ASMGEN documentation and eventually abandoned
- all attempts to deal with this problem; if you have any success with
- it, I'd appreciate hearing of it. In the meantime, view all immediate
- 'offsets' in listings of relocated code with sympathy and keep a hex
- calculator handy.
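
The register substitution described in item (1) is mechanical enough to apply to a listing automatically. What follows is only a minimal sketch in Python, not part of the original scripts: the function name fix_les and the assumed listing format (upper-case mnemonics, register immediately after LES) are illustrative assumptions.

```python
import re

# Byte and word registers listed in the order of the table in item (1);
# entries at the same index are the pair the disassembler confuses.
BYTE_REGS = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
WORD_REGS = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
BYTE_TO_WORD = dict(zip(BYTE_REGS, WORD_REGS))

# Matches an LES whose destination was emitted as a byte register.
# Assumption: mnemonics and registers appear in upper case in the listing.
_LES_BYTE = re.compile(r"\b(LES\s+)(AL|CL|DL|BL|AH|CH|DH|BH)\b")


def fix_les(line: str) -> str:
    """Rewrite 'LES <byte reg>,...' to use the equivalent word register."""
    return _LES_BYTE.sub(lambda m: m.group(1) + BYTE_TO_WORD[m.group(2)], line)
```

Lines that do not contain an LES with a byte-register destination pass through unchanged, so the function can be run over a whole listing one line at a time.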
-
- Problem (3) can be circumvented by splitting COMMAND and IBMBIO into two parts,
- treating the relocated code as a large block of 'byte data' in the original,
- and using the .SEQ files as a guide to disassembling the 'new' file which
- contains the relocated code. The relocated portion of IBMBIO (code which
- loads & interprets CONFIG.SYS) executes at offset 0000 within its segment;
- the 'transient portion of COMMAND.COM' executes at offset 0x0100. This
- procedure will, of course, clarify the internal structure (intra-segment) of
- the relocated procedure but may obscure its interaction with the 'resident'
- segments of the programs.
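
The split itself, and the arithmetic needed to translate a stored position into the offset at which it executes, can be sketched as below. This is a hedged illustration only: the function names are invented here, and the block's starting position and length within each file are NOT given in this document, so they appear purely as parameters. Only the execution bases (0000 for the relocated part of IBMBIO, 0x0100 for the transient portion of COMMAND) come from the text above.

```python
def extract_block(image: bytes, start: int, length: int) -> bytes:
    """Return the relocated block stored at file positions [start, start+length).

    The original image is left untouched; in it, the same range would simply
    be treated as byte data when disassembling.
    """
    if start < 0 or length < 0 or start + length > len(image):
        raise ValueError("block lies outside the image")
    return image[start:start + length]


def runtime_offset(file_pos: int, block_start: int, exec_base: int) -> int:
    """Offset at which the byte stored at file_pos executes after relocation.

    block_start is the file position where the relocated block begins
    (hypothetical, to be found by inspection); exec_base is the offset the
    block runs at within its segment.
    """
    if file_pos < block_start:
        raise ValueError("position precedes the relocated block")
    return file_pos - block_start + exec_base
```

For example, with a made-up block start of 0x1000 and an execution base of 0x0100, the byte stored at file position 0x1234 would execute at offset 0x0334.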
-
- The scripts provided are named BIO3.SEQ (for IBMBIO), DOS3.SEQ (for IBMDOS),
- and COMMAND3.SEQ (for COMMAND.COM). I chose these names so that I could
- work with them on a DOS 2.x system, calling the executable files BIO3.ABS,
- DOS3.ABS, and COMMAND3.COM. The disassembler cares only that the .SEQ file
- and the file to be disassembled have the same name and can be found on the
- same drive when it is invoked. ATTRIB (or, for the brave, DEBUG) will have
- to be used to un-hide the first two files; remember to use caution and to use
- a backup DOS disk.
-
- I claimed earlier that the output of all this was 'fairly intelligible'; I can
- substantiate this by noting that my own understanding of DOS' behavior has
- benefited from the exercise. The work presented here is far from complete;
- hopefully, those interested in a greater level of detail can productively use
- this information as a starting point.
-
-
- Discoveries Big and Small, in no Particular Order :
- ----------- --- --- ------ -- -- ---------- ----- -
-
- 1) All of EXEC (int 21/4b) is now in low memory (in IBMDOS.) This makes
- the task of writing an alternate shell a bit less gruesome than it has
- seemed on previous versions of PC-DOS, although many versions of MS-DOS
- have seen it as the responsibility of the operating system, and not of its
- command interpreter, to load and execute programs. Perhaps TOPVIEW has a
- component which replaces COMMAND ?
-
- If I read things right, IBMBIO 'EXECs' COMMAND (or whatever) 'by hand', i.e.,
- without going through function 4b. Not only would this have made too much
- sense, but the initial invocation of COMMAND.COM must be made with an
- Environment address of 0; this is part of the means COMMAND uses to determine
- that it IS its initial invocation and that user requests to EXIT should be
- ignored. One of the first things COMMAND does is to allocate an environment
- which contains a) a null PATH and b) COMSPEC pointing to itself. A smarter
- shell could probably live without either of these, although it should probably
- allocate something which will pass EXEC's criteria of a 'valid environment.'
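
For anyone experimenting with an alternate shell, the environment COMMAND builds for itself can be approximated as follows. This is a sketch under stated assumptions: it uses the usual DOS environment layout (NUL-terminated NAME=value strings, closed by one extra NUL), the COMSPEC path shown in the usage note is hypothetical, and nothing here claims to reproduce exactly what EXEC checks for.

```python
def build_environment(variables: dict) -> bytes:
    """Pack NAME=value pairs into a DOS-style environment block.

    Each string is NUL-terminated and the block ends with one more NUL.
    Values are assumed to be plain ASCII.
    """
    block = b"".join(
        f"{name}={value}".encode("ascii") + b"\x00"
        for name, value in variables.items()
    )
    return block + b"\x00"
```

A block like the one described above (null PATH, COMSPEC pointing at the shell) would then be built with build_environment({"PATH": "", "COMSPEC": "A:\\COMMAND.COM"}), where the drive and file name are only an example.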
-
- 2) There is (and has been at least since PC-DOS 2.10/MS-DOS 2.11) something
- fishy about the 'CP/M-86 style' function calling mechanism (call psp:0005).
- If you've ever tried to use this mechanism to call old-style functions and
- found it a bit fishy yourself, read on :
-
- There's an entry point in IBMDOS, just above the regular 'INT 21' entry, which,
- after verifying that CL contains a function # <= 0x24, pops a long return
- address from the stack, pushes the flags, moves CL to AH, and falls into the
- INT 21 code near its beginning; when the IRET eventually occurs, it will cause
- a return to the caller's caller. Early on in the initialization part of
- IBMDOS, it sets up a far JMP to this entry point at 0000:00C0, down where INT
- 30 and INT 31 would be vectored, were they not 'reserved for DOS.'
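
The claim that a far JMP at 0000:00C0 sits where INT 30 and INT 31 would be vectored is easy to check with a little arithmetic. The sketch below is illustrative only (the function names are mine); it relies on two facts about the 8086: each interrupt vector occupies four bytes starting at 4 times the interrupt number, and a direct far JMP is five bytes long (opcode EA, a 16-bit offset, and a 16-bit segment).

```python
VECTOR_SIZE = 4      # offset word followed by segment word
FAR_JMP_LENGTH = 5   # opcode EA + 16-bit offset + 16-bit segment


def vector_address(int_no: int) -> int:
    """Address within segment 0000 of the vector for interrupt int_no."""
    return int_no * VECTOR_SIZE


def vectors_overlapped(start: int, length: int) -> list:
    """Interrupt numbers whose vector slots intersect [start, start+length)."""
    first = start // VECTOR_SIZE
    last = (start + length - 1) // VECTOR_SIZE
    return list(range(first, last + 1))
```

With these definitions, vector_address(0x30) is 0xC0, and the five bytes of the JMP cover all of the INT 30 slot plus the first byte of the INT 31 slot.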
-
- The problem is that, due to some unfortunate arithmetic, the code DOS puts at
- offset 0005 of the PSP effectively does a long call to 0000:00BE, which is
- where the segment portion of the INT 2F address should be stored. If this word
- contains a 0, there will be a side effect of executing an 'ADD [BX+SI],AL'
- before executing the long jump; depending, of course, on the contents of these
- registers, this may be less than desirable. Worse yet, IBMDOS v3.0 initializes
- INT 2F to point to an IRET within itself, and this interrupt may be re-vectored
- if SHARE.EXE is invoked; thus, the chances of ever even reaching the long JMP
- become increasingly remote. Whenever a new PSP is created, the old one, with
- this bogus CALL at 0005, is copied into it; one need only find the point where
- the first PSP is created in order to find the source of this bug.
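
Two details of the paragraph above can be verified mechanically: that 0000:00BE is the segment word of the INT 2F vector, and that a zero word there executes as 'ADD [BX+SI],AL'. The following is a minimal sketch with invented function names; the decoder handles only opcode 00 (ADD r/m8, r8) with a mod=00 ModRM byte and no displacement, which is all this case needs.

```python
def vector_field(address: int):
    """Map a low-memory address to (interrupt number, 'offset' or 'segment')."""
    int_no = address // 4
    field = "offset" if address % 4 < 2 else "segment"
    return int_no, field


REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
# Memory operands for mod=00; index 6 is the direct-address form (not handled).
RM_MOD0 = ["[BX+SI]", "[BX+DI]", "[BP+SI]", "[BP+DI]", "[SI]", "[DI]", None, "[BX]"]


def decode_add_rm8_r8(code: bytes) -> str:
    """Decode a two-byte 'ADD r/m8, r8' (opcode 00, mod=00, no displacement)."""
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x00 or mod != 0 or RM_MOD0[rm] is None:
        raise ValueError("form not handled by this sketch")
    return f"ADD {RM_MOD0[rm]},{REG8[reg]}"
```

Here vector_field(0xBE) gives interrupt 0x2F and 'segment', and decoding the bytes 00 00 gives the ADD named above; since that instruction is exactly two bytes long, execution would then continue at 00C0.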
-
- This is all kind of unfortunate; there are some excellent 8080-model CP/M-86
- programs (including an INTERACTIVE disassembler) which would be trivial to port
- to DOS if this calling mechanism worked (e.g., INT E0 -> PUSH CS, PUSH 5,
- RET FAR.)
-
- 3) The functions of interrupts 2A and 2F are a bit more extensive than might
- have previously been indicated (see articles in PC Tech Journal [12/84, pp. 74-
- 75, 102-103].) Both are initialized to point to IRETs; SHARE.EXE steals 2F,
- and one must assume that the 'PC NETWORK PROGRAM' will steal 2A. Calls to
- both functions are scattered throughout the code, but note the 'net_dummy_xxxx'
- routines in IBMDOS. Note also that a table of the 'dummy' routines is
- maintained, as if one could overwrite those addresses with NOPs and usher in
- 3.1. (I looked for such a routine, curious as to what might trigger it and
- what its effect might be. I didn't find it, but that certainly doesn't mean
- that it's not there.)
-
- 4) Enough ! If you're sick and twisted enough to have read this far, you are
- a mutant and will go far in computing, to paraphrase an old Apple reference
- manual. Hopefully, the disassemblies produced by these scripts will enable
- you to understand DOS 3.0 ( & 2.x & 3.1) a bit better; if you discover any
- features/bugs which you find interesting, please pass them on.
-
- Documenting DOS adequately is a dirty job, but someone has to do it. I hope
- that this will help to get things started.
-
- Gary Byers
- CIS [74345,353]